Reinforcement Learning Applied to a Differential Game
Abstract
An application of reinforcement learning to a linear-quadratic differential game is presented. The reinforcement learning system uses a recently developed algorithm, the residual-gradient form of advantage updating. The game is a Markov decision process with continuous time, states, and actions, linear dynamics, and a quadratic cost function. The game has two players, a missile and a plane; the missile pursues the plane and the plane evades the missile. Although a missile-and-plane scenario was chosen as the test bed, the reinforcement learning approach presented here is equally applicable to biologically based systems, such as a predator pursuing prey. The reinforcement learning algorithm for optimal control is modified for differential games to find the minimax point rather than the maximum. Simulation results are compared to the analytical solution, demonstrating that the simulated reinforcement learning system converges to the optimal answer. The performance of both the residual-gradient and non-residual-gradient forms of advantage updating and Q-learning is compared, demonstrating that advantage updating converges faster than Q-learning in all simulations. Advantage updating is also shown to converge regardless of the time step duration, whereas Q-learning is unable to converge as the time step duration grows small.
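To illustrate what finding "the minimax point rather than the maximum" can mean for a quadratic function of two actions, here is a minimal sketch. It is not the paper's algorithm: the function name `saddle_point`, the coefficient names, and the example numbers are all hypothetical. It assumes a scalar action `u` for the minimizing player and a scalar action `v` for the maximizing player, with the function convex in `u` and concave in `v`, so the minimax point is where both partial derivatives vanish.

```python
def saddle_point(a, b, c, d, e):
    """Minimax point of f(u, v) = a*u**2 + b*v**2 + c*u*v + d*u + e*v.

    Illustrative assumption: a > 0 (convex in u, the minimizer's action)
    and b < 0 (concave in v, the maximizer's action).
    """
    if not (a > 0 and b < 0):
        raise ValueError("need a > 0 and b < 0 for a min-over-u, max-over-v saddle")
    # Setting both partial derivatives to zero gives the linear system
    #   2a*u +  c*v = -d
    #    c*u + 2b*v = -e
    # which is solved here with Cramer's rule.
    det = 4.0 * a * b - c * c  # strictly negative under the assumptions above
    u = (c * e - 2.0 * b * d) / det
    v = (c * d - 2.0 * a * e) / det
    return u, v


def quadratic(a, b, c, d, e, u, v):
    """Evaluate the same quadratic at (u, v)."""
    return a * u**2 + b * v**2 + c * u * v + d * u + e * v


# Hypothetical coefficients chosen only for illustration.
coeffs = (1.0, -1.0, 1.0, -2.0, 1.0)
u_star, v_star = saddle_point(*coeffs)
value = quadratic(*coeffs, u_star, v_star)
```

At the returned point, moving `u` away from `u_star` can only increase the value and moving `v` away from `v_star` can only decrease it, which is the saddle property a minimax solution requires.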
Similar resources
An Adaptive Learning Game for Autistic Children using Reinforcement Learning and Fuzzy Logic
This paper presents an adapted serious game for rating social ability in children with autism spectrum disorder (ASD). The required measurements are obtained by challenges of the proposed serious game. The proposed serious game uses reinforcement learning concepts for being adaptive. It is based on fuzzy logic to evaluate the social ability level of the children with ASD. The game adapts itsel...
Residual Advantage Learning Applied to a Differential Game
An application of reinforcement learning to a differential game is presented. The reinforcement learning system uses a recently developed algorithm, the residual form of advantage learning. The game is a Markov decision process (MDP) with continuous states and nonlinear dynamics. The game consists of two players, a missile and a plane; the missile pursues the plane and the plane evades the miss...
Robust Trajectory Optimization: A Cooperative Stochastic Game Theoretic Approach
We present a novel trajectory optimization framework to address the issue of robustness, scalability and efficiency in optimal control and reinforcement learning. Based on prior work in Cooperative Stochastic Differential Game (CSDG) theory, our method performs local trajectory optimization using cooperative controllers. The resulting framework is called Cooperative Game-Differential Dynamic Pr...
Development of Reinforcement Learning Algorithm to Study the Capacity Withholding in Electricity Energy Markets
This paper addresses the possibility of capacity withholding by energy producers, who seek to increase the market price and their own profits. The energy market is simulated as an iterative game, where each state game corresponds to an hourly energy auction with uniform pricing mechanism. The producers are modeled as agents that interact with their environment through reinforcement learning (RL...
A Reinforcement Learning Adaptive Fuzzy Controller for Differential Games
In this paper we develop a reinforcement fuzzy learning scheme for robots playing a differential game. Differential games are games played in continuous time, with continuous states and actions. Fuzzy controllers are used to approximate the calculation of future reinforcements of the game due to actions taken at a specific time. If an immediate reinforcement reward function is defined, we may u...
Journal title:
- Adaptive Behaviour
Volume 4, Issue
Pages -
Publication date 1995